WoSIT: A Word Sense Induction Toolkit for Search Result Clustering and Diversification
نویسندگان
چکیده
In this demonstration we present WoSIT, an API for Word Sense Induction (WSI) algorithms. The toolkit provides implementations of existing graph-based WSI algorithms, but can also be extended with new algorithms. The main mission of WoSIT is to provide a framework for the extrinsic evaluation of WSI algorithms, also within end-user applications such as Web search result clustering and diversification.
منابع مشابه
SemEval-2013 Task 11: Word Sense Induction and Disambiguation within an End-User Application
In this paper we describe our Semeval-2013 task on Word Sense Induction and Disambiguation within an end-user application, namely Web search result clustering and diversification. Given a target query, induction and disambiguation systems are requested to cluster and diversify the search results returned by a search engine for that query. The task enables the end-to-end evaluation and compariso...
متن کاملSemantic is Beautiful: Clustering and Diversifying Search Results with Graph-based Word Sense Induction
Web search result clustering aims to facilitate information search on the Web. Rather than presenting the results of a query as a flat list, these are grouped on the basis of their similarity and subsequently shown to the user as a list of possibly labeled clusters. Each cluster is supposed to represent a different meaning of the input query, thus taking into account the language ambiguity issu...
متن کاملClustering Web Search Results with Maximum Spanning Trees
We present a novel method for clustering Web search results based on Word Sense Induction. First, we acquire the meanings of a query by means of a graph-based clustering algorithm that calculates the maximum spanning tree of the co-occurrence graph of the query. Then we cluster the search results based on their semantic similarity to the induced word senses. We show that our approach improves c...
متن کاملMultilingual Word Sense Induction to Improve Web Search Result Clustering
In [12] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text was presented; key to the proposed approach is the idea of, first, automatically inducing senses for the target query and, second, clustering the search results based on their semantic similarity to the word senses induced. In [1] we proposed an innov...
متن کاملClustering and Diversifying Web Search Results with Graph-Based Word Sense Induction
Web search result clustering aims to facilitate information search on the Web. Rather than the results of a query being presented as a flat list, they are grouped on the basis of their similarity and subsequently shown to the user as a list of clusters. Each cluster is intended to represent a different meaning of the input query, thus taking into account the lexical ambiguity (i.e., polysemy) i...
متن کامل